Overview

Dataset statistics

Number of variables: 28
Number of observations: 5043
Missing cells: 2405
Missing cells (%): 1.7%
Duplicate rows: 37
Duplicate rows (%): 0.7%
Total size in memory: 1.1 MiB
Average record size in memory: 224.0 B
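
The percentages in this block follow from the raw counts. A minimal sketch of that arithmetic, assuming the missing-cell percentage is taken over all cells (rows times variables) and the duplicate percentage over rows; the variable names are illustrative and not part of the report:

```python
# Counts copied from the dataset statistics.
n_variables = 28
n_observations = 5043
missing_cells = 2405
duplicate_rows = 37
avg_record_bytes = 224.0

# Assumption: the missing percentage is relative to every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations

# Total memory is approximated as average record size times row count.
total_mib = avg_record_bytes * n_observations / 2**20

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Duplicate rows: {duplicate_pct:.1f}%")
print(f"Approx. size: {total_mib:.1f} MiB")
```

Under these assumptions the rounded results agree with the reported 1.7%, 0.7% and 1.1 MiB.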

Variable types

Categorical: 12
Numeric: 16

Alerts

Dataset has 37 (0.7%) duplicate rows Duplicates
director_name has a high cardinality: 2398 distinct values High cardinality
actor_2_name has a high cardinality: 3032 distinct values High cardinality
genres has a high cardinality: 914 distinct values High cardinality
actor_1_name has a high cardinality: 2097 distinct values High cardinality
movie_title has a high cardinality: 4917 distinct values High cardinality
actor_3_name has a high cardinality: 3521 distinct values High cardinality
plot_keywords has a high cardinality: 4760 distinct values High cardinality
movie_imdb_link has a high cardinality: 4919 distinct values High cardinality
country has a high cardinality: 65 distinct values High cardinality
num_critic_for_reviews is highly correlated with num_voted_users and 1 other fields High correlation
actor_3_fb_likes is highly correlated with actor_1_fb_likes and 2 other fields High correlation
actor_1_fb_likes is highly correlated with actor_3_fb_likes and 2 other fields High correlation
gross is highly correlated with num_voted_users and 2 other fields High correlation
num_voted_users is highly correlated with num_critic_for_reviews and 3 other fields High correlation
cast_total_fb_likes is highly correlated with actor_3_fb_likes and 2 other fields High correlation
num_user_for_reviews is highly correlated with num_critic_for_reviews and 2 other fields High correlation
budget is highly correlated with gross and 1 other fields High correlation
actor_2_fb_likes is highly correlated with actor_3_fb_likes and 2 other fields High correlation
num_critic_for_reviews is highly correlated with num_voted_users and 2 other fields High correlation
actor_3_fb_likes is highly correlated with actor_2_fb_likes High correlation
actor_1_fb_likes is highly correlated with cast_total_fb_likes High correlation
gross is highly correlated with num_voted_users and 1 other fields High correlation
num_voted_users is highly correlated with num_critic_for_reviews and 3 other fields High correlation
cast_total_fb_likes is highly correlated with actor_1_fb_likes and 1 other fields High correlation
num_user_for_reviews is highly correlated with num_critic_for_reviews and 2 other fields High correlation
actor_2_fb_likes is highly correlated with actor_3_fb_likes and 1 other fields High correlation
movie_fb_likes is highly correlated with num_critic_for_reviews and 1 other fields High correlation
num_critic_for_reviews is highly correlated with num_voted_users and 1 other fields High correlation
actor_3_fb_likes is highly correlated with cast_total_fb_likes and 1 other fields High correlation
actor_1_fb_likes is highly correlated with cast_total_fb_likes and 1 other fields High correlation
gross is highly correlated with num_voted_users High correlation
num_voted_users is highly correlated with num_critic_for_reviews and 2 other fields High correlation
cast_total_fb_likes is highly correlated with actor_3_fb_likes and 2 other fields High correlation
num_user_for_reviews is highly correlated with num_critic_for_reviews and 1 other fields High correlation
actor_2_fb_likes is highly correlated with actor_3_fb_likes and 2 other fields High correlation
country is highly correlated with language High correlation
language is highly correlated with country High correlation
color is highly correlated with title_year High correlation
num_critic_for_reviews is highly correlated with gross and 3 other fields High correlation
duration is highly correlated with language and 2 other fields High correlation
actor_3_fb_likes is highly correlated with gross and 2 other fields High correlation
actor_1_fb_likes is highly correlated with cast_total_fb_likes High correlation
gross is highly correlated with num_critic_for_reviews and 3 other fields High correlation
num_voted_users is highly correlated with num_critic_for_reviews and 5 other fields High correlation
cast_total_fb_likes is highly correlated with actor_3_fb_likes and 2 other fields High correlation
num_user_for_reviews is highly correlated with num_critic_for_reviews and 3 other fields High correlation
language is highly correlated with duration and 2 other fields High correlation
country is highly correlated with duration and 2 other fields High correlation
content_rating is highly correlated with duration and 2 other fields High correlation
budget is highly correlated with language and 1 other fields High correlation
title_year is highly correlated with color and 1 other fields High correlation
actor_2_fb_likes is highly correlated with cast_total_fb_likes High correlation
imdb_score is highly correlated with num_voted_users and 1 other fields High correlation
aspect_ratio is highly correlated with content_rating High correlation
movie_fb_likes is highly correlated with num_critic_for_reviews and 1 other fields High correlation
director_name has 104 (2.1%) missing values Missing
director_fb_likes has 104 (2.1%) missing values Missing
gross has 677 (13.4%) missing values Missing
plot_keywords has 153 (3.0%) missing values Missing
content_rating has 303 (6.0%) missing values Missing
budget has 406 (8.1%) missing values Missing
title_year has 108 (2.1%) missing values Missing
aspect_ratio has 329 (6.5%) missing values Missing
budget is highly skewed (γ1 = 48.57751667) Skewed
movie_title is uniformly distributed Uniform
actor_3_name is uniformly distributed Uniform
plot_keywords is uniformly distributed Uniform
movie_imdb_link is uniformly distributed Uniform
director_fb_likes has 907 (18.0%) zeros Zeros
actor_3_fb_likes has 89 (1.8%) zeros Zeros
facenumber_in_poster has 2152 (42.7%) zeros Zeros
actor_2_fb_likes has 55 (1.1%) zeros Zeros
movie_fb_likes has 2181 (43.2%) zeros Zeros
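
The Skewed alert for budget reports a skewness coefficient γ1. As a rough illustration of what such a coefficient measures, here is a sketch of the plain moment-based skewness (third central moment divided by the 1.5 power of the second). This is an assumption for illustration: the report's exact estimator is not stated here and may apply a bias correction, and the toy data below is hypothetical rather than taken from the dataset.

```python
def moment_skewness(values):
    """Moment-based skewness m3 / m2**1.5, with no bias correction."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5


# Hypothetical toy data: many small values plus one extreme value.
toy_budget = [1.0] * 99 + [1000.0]
# Symmetric data has zero skewness.
symmetric = [1.0, 2.0, 3.0, 4.0, 5.0]

print(moment_skewness(toy_budget))
print(moment_skewness(symmetric))
```

A single extreme value is enough to push the coefficient far above zero, which is the kind of pattern a large γ1 points to.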

Reproduction

Analysis started: 2022-01-14 01:03:22.794886
Analysis finished: 2022-01-14 01:04:05.607387
Duration: 42.81 seconds
Software version: pandas-profiling v3.1.0
Download configuration: config.json

Variables

color
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 19
Missing (%): 0.4%
Memory size: 39.5 KiB
Color: 4815
Black and White: 209

Length

Max length: 16
Median length: 5
Mean length: 5.457603503
Min length: 5
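
The mean length can be reproduced from the category counts. A minimal sketch, assuming the 19 missing values are excluded from the length statistics and that the longer category is stored with 16 characters, as the reported maximum length indicates (one more than the 15 visible in "Black and White", presumably a whitespace character in the raw value):

```python
# Category counts from the summary; lengths from the Length table.
# Assumption: missing values do not contribute to the length statistics.
length_counts = {
    5: 4815,   # "Color" (minimum and median length 5)
    16: 209,   # "Black and White" as stored (maximum length 16)
}

n_values = sum(length_counts.values())
total_chars = sum(length * count for length, count in length_counts.items())
mean_length = total_chars / n_values

print(n_values)
print(mean_length)
```

Under these assumptions the result agrees with the reported mean length of 5.457603503.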

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Color
2nd row: Color
3rd row: Color
4th row: Color
5th row: Color

Common Values

Value   Count   Frequency (%)
Color   4815   95.5%
Black and White   209   4.1%
(Missing)   19   0.4%

Length

Histogram of lengths of the category

Pie chart

Value   Count   Frequency (%)
color   4815   88.5%
white   209   3.8%
and   209   3.8%
black   209   3.8%
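
The word-level percentages differ from the category-level ones because each value is broken into words before counting. A minimal sketch of that counting, assuming values are lowercased and split on whitespace (the report's actual tokenizer is not shown here):

```python
from collections import Counter

# Category counts from the Common Values table.
category_counts = {"Color": 4815, "Black and White": 209}

# Assumption: each value is lowercased and split on whitespace,
# and every resulting word is counted once per row.
word_counts = Counter()
for value, count in category_counts.items():
    for word in value.lower().split():
        word_counts[word] += count

total_words = sum(word_counts.values())
for word, count in word_counts.most_common():
    print(f"{word}: {count} ({100 * count / total_words:.1f}%)")
```

With 5442 words in total, "color" accounts for about 88.5% and each word of "Black and White" for about 3.8%, matching the word table.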

director_name
Categorical

HIGH CARDINALITY
MISSING

Distinct2398
Distinct (%)48.6%
Missing104
Missing (%)2.1%
Memory size39.5 KiB
Steven Spielberg
 
26
Woody Allen
 
22
Clint Eastwood
 
20
Martin Scorsese
 
20
Ridley Scott
 
17
Other values (2393)
4834 

Length

Max length32
Median length13
Mean length13.08483499
Min length3

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1504 ?
Unique (%)30.5%

Sample

1st rowJames Cameron
2nd rowGore Verbinski
3rd rowSam Mendes
4th rowChristopher Nolan
5th rowDoug Walker

Common Values

ValueCountFrequency (%)
Steven Spielberg26
 
0.5%
Woody Allen22
 
0.4%
Clint Eastwood20
 
0.4%
Martin Scorsese20
 
0.4%
Ridley Scott17
 
0.3%
Spike Lee16
 
0.3%
Tim Burton16
 
0.3%
Steven Soderbergh16
 
0.3%
Renny Harlin15
 
0.3%
Oliver Stone14
 
0.3%
Other values (2388)4757
94.3%
(Missing)104
 
2.1%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
john180
 
1.8%
david150
 
1.5%
michael127
 
1.2%
james87
 
0.8%
peter85
 
0.8%
robert84
 
0.8%
paul81
 
0.8%
richard80
 
0.8%
scott65
 
0.6%
lee58
 
0.6%
Other values (2966)9277
90.3%

num_critic_for_reviews
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 528
Distinct (%): 10.6%
Missing: 50
Missing (%): 1.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 140.194272
Minimum: 1
Maximum: 813
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 39.5 KiB

Quantile statistics

Minimum: 1
5th percentile: 9
Q1: 50
Median: 110
Q3: 195
95th percentile: 387
Maximum: 813
Range: 812
Interquartile range (IQR): 145

Descriptive statistics

Standard deviation: 121.6016754
Coefficient of variation (CV): 0.8673797701
Kurtosis: 2.91341641
Mean: 140.194272
Median Absolute Deviation (MAD): 68
Skewness: 1.5165327
Sum: 699990
Variance: 14786.96746
Monotonicity: Not monotonic
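
Several of these figures are related by simple identities, which gives a quick consistency check. A minimal sketch using the num_critic_for_reviews values; the tolerances and the assumption that the mean is taken over non-missing rows are illustrative choices:

```python
import math

# Figures copied from the num_critic_for_reviews summary.
n_rows, n_missing = 5043, 50
total, mean = 699990, 140.194272
std, variance, cv = 121.6016754, 14786.96746, 0.8673797701
minimum, maximum, q1, q3 = 1, 813, 50, 195

# Assumption: statistics are computed over non-missing rows only.
n_valid = n_rows - n_missing

checks = {
    "mean = sum / non-missing count": math.isclose(total / n_valid, mean, rel_tol=1e-6),
    "variance = std ** 2": math.isclose(std ** 2, variance, rel_tol=1e-6),
    "CV = std / mean": math.isclose(std / mean, cv, rel_tol=1e-6),
    "range = max - min": maximum - minimum == 812,
    "IQR = Q3 - Q1": q3 - q1 == 145,
}
for name, ok in checks.items():
    print(f"{name}: {ok}")
```

The same identities can be applied to any other numeric variable in the report.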
Histogram with fixed size bins (bins=50)
Value   Count   Frequency (%)
1   43   0.9%
9   37   0.7%
5   36   0.7%
10   35   0.7%
8   35   0.7%
12   34   0.7%
81   33   0.7%
16   33   0.7%
43   31   0.6%
29   30   0.6%
Other values (518)   4646   92.1%
(Missing)   50   1.0%

Value   Count   Frequency (%)
1   43   0.9%
2   26   0.5%
3   24   0.5%
4   29   0.6%
5   36   0.7%
6   28   0.6%
7   23   0.5%
8   35   0.7%
9   37   0.7%
10   35   0.7%

Value   Count   Frequency (%)
813   1   < 0.1%
775   1   < 0.1%
765   1   < 0.1%
750   2   < 0.1%
739   1   < 0.1%
738   1   < 0.1%
733   1   < 0.1%
723   1   < 0.1%
712   1   < 0.1%
703   2   < 0.1%

duration
Real number (ℝ≥0)

HIGH CORRELATION

Distinct191
Distinct (%)3.8%
Missing15
Missing (%)0.3%
Infinite0
Infinite (%)0.0%
Mean107.201074
Minimum7
Maximum511
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size39.5 KiB

Quantile statistics

Minimum7
5-th percentile81
Q193
median103
Q3118
95-th percentile146
Maximum511
Range504
Interquartile range (IQR)25

Descriptive statistics

Standard deviation25.19744081
Coefficient of variation (CV)0.235048399
Kurtosis22.56579716
Mean107.201074
Median Absolute Deviation (MAD)12
Skewness2.339134041
Sum539007
Variance634.9110233
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
Value   Count   Frequency (%)
90   161   3.2%
100   141   2.8%
101   139   2.8%
98   135   2.7%
97   131   2.6%
93   129   2.6%
95   124   2.5%
99   124   2.5%
94   124   2.5%
96   113   2.2%
Other values (181)   3707   73.5%

Value   Count   Frequency (%)
7   2   < 0.1%
11   1   < 0.1%
14   1   < 0.1%
20   1   < 0.1%
22   7   0.1%
23   2   < 0.1%
24   2   < 0.1%
25   4   0.1%
27   1   < 0.1%
28   1   < 0.1%

Value   Count   Frequency (%)
511   1   < 0.1%
334   1   < 0.1%
330   1   < 0.1%
325   1   < 0.1%
300   1   < 0.1%
293   1   < 0.1%
289   1   < 0.1%
286   1   < 0.1%
280   1   < 0.1%
271   1   < 0.1%

director_fb_likes
Real number (ℝ≥0)

MISSING
ZEROS

Distinct435
Distinct (%)8.8%
Missing104
Missing (%)2.1%
Infinite0
Infinite (%)0.0%
Mean686.5092124
Minimum0
Maximum23000
Zeros907
Zeros (%)18.0%
Negative0
Negative (%)0.0%
Memory size39.5 KiB

Quantile statistics

Minimum0
5-th percentile0
Q17
median49
Q3194.5
95-th percentile973
Maximum23000
Range23000
Interquartile range (IQR)187.5

Descriptive statistics

Standard deviation2813.328607
Coefficient of variation (CV)4.098020181
Kurtosis27.25628935
Mean686.5092124
Median Absolute Deviation (MAD)49
Skewness5.22970117
Sum3390669
Variance7914817.85
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
Value   Count   Frequency (%)
0   907   18.0%
3   70   1.4%
6   66   1.3%
7   64   1.3%
2   63   1.2%
4   60   1.2%
11   59   1.2%
10   53   1.1%
8   52   1.0%
5   52   1.0%
Other values (425)   3493   69.3%
(Missing)   104   2.1%

Value   Count   Frequency (%)
0   907   18.0%
2   63   1.2%
3   70   1.4%
4   60   1.2%
5   52   1.0%
6   66   1.3%
7   64   1.3%
8   52   1.0%
9   49   1.0%
10   53   1.1%

Value   Count   Frequency (%)
23000   1   < 0.1%
22000   8   0.2%
21000   10   0.2%
20000   1   < 0.1%
18000   4   0.1%
17000   20   0.4%
16000   28   0.6%
15000   2   < 0.1%
14000   30   0.6%
13000   26   0.5%

actor_3_fb_likes
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct906
Distinct (%)18.0%
Missing23
Missing (%)0.5%
Infinite0
Infinite (%)0.0%
Mean645.009761
Minimum0
Maximum23000
Zeros89
Zeros (%)1.8%
Negative0
Negative (%)0.0%
Memory size39.5 KiB

Quantile statistics

Minimum0
5-th percentile10
Q1133
median371.5
Q3636
95-th percentile1000
Maximum23000
Range23000
Interquartile range (IQR)503

Descriptive statistics

Standard deviation1665.041728
Coefficient of variation (CV)2.581420979
Kurtosis60.56388811
Mean645.009761
Median Absolute Deviation (MAD)248.5
Skewness7.279020793
Sum3237949
Variance2772363.957
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
Value   Count   Frequency (%)
1000   126   2.5%
0   89   1.8%
11000   29   0.6%
3   28   0.6%
2000   27   0.5%
3000   26   0.5%
826   22   0.4%
2   21   0.4%
4   21   0.4%
7   21   0.4%
Other values (896)   4610   91.4%
(Missing)   23   0.5%

Value   Count   Frequency (%)
0   89   1.8%
2   21   0.4%
3   28   0.6%
4   21   0.4%
5   18   0.4%
6   18   0.4%
7   21   0.4%
8   17   0.3%
9   16   0.3%
10   12   0.2%

Value   Count   Frequency (%)
23000   2   < 0.1%
20000   1   < 0.1%
19000   5   0.1%
17000   1   < 0.1%
16000   3   0.1%
15000   1   < 0.1%
14000   6   0.1%
13000   5   0.1%
12000   8   0.2%
11000   29   0.6%

actor_2_name
Categorical

HIGH CARDINALITY

Distinct3032
Distinct (%)60.3%
Missing13
Missing (%)0.3%
Memory size39.5 KiB
Morgan Freeman
 
20
Charlize Theron
 
15
Brad Pitt
 
14
Meryl Streep
 
11
James Franco
 
11
Other values (3027)
4959 

Length

Max length28
Median length13
Mean length13.07435388
Min length3

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique2089 ?
Unique (%)41.5%

Sample

1st rowJoel David Moore
2nd rowOrlando Bloom
3rd rowRory Kinnear
4th rowChristian Bale
5th rowRob Walker

Common Values

ValueCountFrequency (%)
Morgan Freeman20
 
0.4%
Charlize Theron15
 
0.3%
Brad Pitt14
 
0.3%
Meryl Streep11
 
0.2%
James Franco11
 
0.2%
Jason Flemyng10
 
0.2%
Adam Sandler10
 
0.2%
Bruce Willis9
 
0.2%
Robert Duvall9
 
0.2%
Angelina Jolie Pitt9
 
0.2%
Other values (3022)4912
97.4%
(Missing)13
 
0.3%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
michael102
 
1.0%
david60
 
0.6%
john56
 
0.5%
james53
 
0.5%
scott52
 
0.5%
tom50
 
0.5%
jason44
 
0.4%
robert44
 
0.4%
kevin41
 
0.4%
thomas39
 
0.4%
Other values (3825)9861
94.8%

actor_1_fb_likes
Real number (ℝ≥0)

HIGH CORRELATION

Distinct878
Distinct (%)17.4%
Missing7
Missing (%)0.1%
Infinite0
Infinite (%)0.0%
Mean6560.047061
Minimum0
Maximum640000
Zeros26
Zeros (%)0.5%
Negative0
Negative (%)0.0%
Memory size39.5 KiB

Quantile statistics

Minimum0
5-th percentile95.5
Q1614
median988
Q311000
95-th percentile24000
Maximum640000
Range640000
Interquartile range (IQR)10386

Descriptive statistics

Standard deviation15020.75912
Coefficient of variation (CV)2.289733439
Kurtosis683.5473559
Mean6560.047061
Median Absolute Deviation (MAD)752.5
Skewness19.12177638
Sum33036397
Variance225623204.5
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
Value   Count   Frequency (%)
1000   449   8.9%
11000   211   4.2%
2000   197   3.9%
3000   155   3.1%
12000   135   2.7%
13000   127   2.5%
14000   123   2.4%
10000   112   2.2%
18000   109   2.2%
22000   82   1.6%
Other values (868)   3336   66.2%

Value   Count   Frequency (%)
0   26   0.5%
2   8   0.2%
3   4   0.1%
4   2   < 0.1%
5   7   0.1%
6   3   0.1%
7   3   0.1%
8   1   < 0.1%
9   3   0.1%
10   1   < 0.1%

Value   Count   Frequency (%)
640000   1   < 0.1%
260000   3   0.1%
164000   2   < 0.1%
137000   2   < 0.1%
87000   8   0.2%
77000   1   < 0.1%
49000   27   0.5%
46000   1   < 0.1%
45000   5   0.1%
44000   2   < 0.1%

gross
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct4224
Distinct (%)96.7%
Missing677
Missing (%)13.4%
Infinite0
Infinite (%)0.0%
Mean46720940.5
Minimum162
Maximum760505847
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size39.5 KiB

Quantile statistics

Minimum162
5-th percentile59477.75
Q14587414.5
median24004159
Q359548719.5
95-th percentile177318686.5
Maximum760505847
Range760505685
Interquartile range (IQR)54961305

Descriptive statistics

Standard deviation: 67365553.29
Coefficient of variation (CV): 1.441870659
Kurtosis: 15.48153103
Mean: 46720940.5
Median Absolute Deviation (MAD): 22300451
Skewness: 3.192180098
Sum: 2.039836262 × 10^11
Variance: 4.538117771 × 10^15
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value   Count   Frequency (%)
5000000   4   0.1%
177343675   3   0.1%
8000000   3   0.1%
47000000   3   0.1%
26410477   3   0.1%
5773519   3   0.1%
218051260   3   0.1%
34964818   3   0.1%
144512310   3   0.1%
7000000   3   0.1%
Other values (4214)   4335   86.0%
(Missing)   677   13.4%

Value   Count   Frequency (%)
162   1   < 0.1%
423   1   < 0.1%
607   1   < 0.1%
703   1   < 0.1%
721   1   < 0.1%
728   1   < 0.1%
828   1   < 0.1%
1029   1   < 0.1%
1036   1   < 0.1%
1100   1   < 0.1%

Value   Count   Frequency (%)
760505847   1   < 0.1%
658672302   1   < 0.1%
652177271   1   < 0.1%
623279547   2   < 0.1%
533316061   1   < 0.1%
474544677   1   < 0.1%
460935665   1   < 0.1%
458991599   1   < 0.1%
448130642   1   < 0.1%
436471036   1   < 0.1%

genres
Categorical

HIGH CARDINALITY

Distinct914
Distinct (%)18.1%
Missing0
Missing (%)0.0%
Memory size39.5 KiB
Drama
 
236
Comedy
 
209
Comedy|Drama
 
191
Comedy|Drama|Romance
 
187
Comedy|Romance
 
158
Other values (909)
4062 

Length

Max length64
Median length20
Mean length20.31310728
Min length5

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique495 ?
Unique (%)9.8%

Sample

1st rowAction|Adventure|Fantasy|Sci-Fi
2nd rowAction|Adventure|Fantasy
3rd rowAction|Adventure|Thriller
4th rowAction|Thriller
5th rowDocumentary

Common Values

ValueCountFrequency (%)
Drama236
 
4.7%
Comedy209
 
4.1%
Comedy|Drama191
 
3.8%
Comedy|Drama|Romance187
 
3.7%
Comedy|Romance158
 
3.1%
Drama|Romance152
 
3.0%
Crime|Drama|Thriller101
 
2.0%
Horror71
 
1.4%
Action|Crime|Drama|Thriller68
 
1.3%
Action|Crime|Thriller65
 
1.3%
Other values (904)3605
71.5%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
drama236
 
4.7%
comedy209
 
4.1%
comedy|drama191
 
3.8%
comedy|drama|romance187
 
3.7%
comedy|romance158
 
3.1%
drama|romance152
 
3.0%
crime|drama|thriller101
 
2.0%
horror71
 
1.4%
action|crime|drama|thriller68
 
1.3%
action|crime|thriller65
 
1.3%
Other values (904)3605
71.5%

actor_1_name
Categorical

HIGH CARDINALITY

Distinct2097
Distinct (%)41.6%
Missing7
Missing (%)0.1%
Memory size39.5 KiB
Robert De Niro
 
49
Johnny Depp
 
41
Nicolas Cage
 
33
J.K. Simmons
 
31
Bruce Willis
 
30
Other values (2092)
4852 

Length

Max length27
Median length13
Mean length13.19241461
Min length4

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1360 ?
Unique (%)27.0%

Sample

1st rowCCH Pounder
2nd rowJohnny Depp
3rd rowChristoph Waltz
4th rowTom Hardy
5th rowDoug Walker

Common Values

ValueCountFrequency (%)
Robert De Niro49
 
1.0%
Johnny Depp41
 
0.8%
Nicolas Cage33
 
0.7%
J.K. Simmons31
 
0.6%
Bruce Willis30
 
0.6%
Denzel Washington30
 
0.6%
Matt Damon30
 
0.6%
Liam Neeson29
 
0.6%
Harrison Ford27
 
0.5%
Robin Williams27
 
0.5%
Other values (2087)4709
93.4%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
robert109
 
1.0%
tom93
 
0.9%
michael89
 
0.9%
jason59
 
0.6%
de57
 
0.5%
james54
 
0.5%
bruce51
 
0.5%
steve50
 
0.5%
jr49
 
0.5%
niro49
 
0.5%
Other values (2888)9784
93.7%

movie_title
Categorical

HIGH CARDINALITY
UNIFORM

Distinct4917
Distinct (%)97.5%
Missing0
Missing (%)0.0%
Memory size39.5 KiB
Pan 
 
3
King Kong 
 
3
Ben-Hur 
 
3
The Fast and the Furious 
 
3
Halloween 
 
3
Other values (4912)
5028 

Length

Max length87
Median length15
Mean length16.54967281
Min length2

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique4798 ?
Unique (%)95.1%

Sample

1st rowAvatar 
2nd rowPirates of the Caribbean: At World's End 
3rd rowSpectre 
4th rowThe Dark Knight Rises 
5th rowStar Wars: Episode VII - The Force Awakens 

Common Values

ValueCountFrequency (%)
Pan 3
 
0.1%
King Kong 3
 
0.1%
Ben-Hur 3
 
0.1%
The Fast and the Furious 3
 
0.1%
Halloween 3
 
0.1%
Home 3
 
0.1%
Victor Frankenstein 3
 
0.1%
Day of the Dead 2
 
< 0.1%
Aloha 2
 
< 0.1%
Halloween II 2
 
< 0.1%
Other values (4907)5016
99.5%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
the1606
 
11.5%
of483
 
3.5%
a193
 
1.4%
and150
 
1.1%
in123
 
0.9%
to107
 
0.8%
2104
 
0.7%
81
 
0.6%
man66
 
0.5%
love56
 
0.4%
Other values (4905)10987
78.7%

num_voted_users
Real number (ℝ≥0)

HIGH CORRELATION

Distinct4826
Distinct (%)95.7%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean83668.16082
Minimum5
Maximum1689764
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size39.5 KiB

Quantile statistics

Minimum5
5-th percentile514.6
Q18593.5
median34359
Q396309
95-th percentile332254.9
Maximum1689764
Range1689759
Interquartile range (IQR)87715.5

Descriptive statistics

Standard deviation: 138485.2568
Coefficient of variation (CV): 1.655172714
Kurtosis: 24.44552017
Mean: 83668.16082
Median Absolute Deviation (MAD): 30816
Skewness: 4.029871144
Sum: 421938535
Variance: 1.917816635 × 10^10
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value   Count   Frequency (%)
57   5   0.1%
6   4   0.1%
6025   3   0.1%
374   3   0.1%
53   3   0.1%
3119   3   0.1%
62   3   0.1%
162   3   0.1%
2541   3   0.1%
8   3   0.1%
Other values (4816)   5010   99.3%

Value   Count   Frequency (%)
5   2   < 0.1%
6   4   0.1%
7   2   < 0.1%
8   3   0.1%
10   1   < 0.1%
13   1   < 0.1%
15   2   < 0.1%
16   1   < 0.1%
18   2   < 0.1%
19   1   < 0.1%

Value   Count   Frequency (%)
1689764   1   < 0.1%
1676169   1   < 0.1%
1468200   1   < 0.1%
1347461   1   < 0.1%
1324680   1   < 0.1%
1251222   1   < 0.1%
1238746   1   < 0.1%
1217752   1   < 0.1%
1215718   1   < 0.1%
1155770   1   < 0.1%

cast_total_fb_likes
Real number (ℝ≥0)

HIGH CORRELATION

Distinct3978
Distinct (%)78.9%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean9699.063851
Minimum0
Maximum656730
Zeros33
Zeros (%)0.7%
Negative0
Negative (%)0.0%
Memory size39.5 KiB

Quantile statistics

Minimum0
5-th percentile179
Q11411
median3090
Q313756.5
95-th percentile36927.7
Maximum656730
Range656730
Interquartile range (IQR)12345.5

Descriptive statistics

Standard deviation18163.79912
Coefficient of variation (CV)1.872737349
Kurtosis361.2551153
Mean9699.063851
Median Absolute Deviation (MAD)2302
Skewness12.83192773
Sum48912379
Variance329923598.6
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
Value   Count   Frequency (%)
0   33   0.7%
5   7   0.1%
2020   6   0.1%
2   6   0.1%
1044   5   0.1%
673   5   0.1%
29   5   0.1%
2321   4   0.1%
1554   4   0.1%
646   4   0.1%
Other values (3968)   4964   98.4%

Value   Count   Frequency (%)
0   33   0.7%
2   6   0.1%
3   1   < 0.1%
4   2   < 0.1%
5   7   0.1%
6   2   < 0.1%
7   1   < 0.1%
8   2   < 0.1%
10   1   < 0.1%
11   2   < 0.1%

Value   Count   Frequency (%)
656730   1   < 0.1%
303717   1   < 0.1%
283939   1   < 0.1%
263584   1   < 0.1%
261818   1   < 0.1%
170118   1   < 0.1%
140268   1   < 0.1%
137712   1   < 0.1%
120797   1   < 0.1%
108016   1   < 0.1%

actor_3_name
Categorical

HIGH CARDINALITY
UNIFORM

Distinct3521
Distinct (%)70.1%
Missing23
Missing (%)0.5%
Memory size39.5 KiB
Steve Coogan
 
8
Ben Mendelsohn
 
8
John Heard
 
8
Anne Hathaway
 
7
Lois Maxwell
 
7
Other values (3516)
4982 

Length

Max length29
Median length13
Mean length13.08227092
Min length3

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique2648 ?
Unique (%)52.7%

Sample

1st rowWes Studi
2nd rowJack Davenport
3rd rowStephanie Sigman
4th rowJoseph Gordon-Levitt
5th rowPolly Walker

Common Values

ValueCountFrequency (%)
Steve Coogan8
 
0.2%
Ben Mendelsohn8
 
0.2%
John Heard8
 
0.2%
Anne Hathaway7
 
0.1%
Lois Maxwell7
 
0.1%
Stephen Root7
 
0.1%
Jon Gries7
 
0.1%
Robert Duvall7
 
0.1%
Sam Shepard7
 
0.1%
Kirsten Dunst7
 
0.1%
Other values (3511)4947
98.1%
(Missing)23
 
0.5%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
michael86
 
0.8%
john80
 
0.8%
david70
 
0.7%
james69
 
0.7%
robert46
 
0.4%
tom43
 
0.4%
paul42
 
0.4%
kevin41
 
0.4%
peter38
 
0.4%
steve36
 
0.3%
Other values (4307)9842
94.7%

facenumber_in_poster
Real number (ℝ≥0)

ZEROS

Distinct19
Distinct (%)0.4%
Missing13
Missing (%)0.3%
Infinite0
Infinite (%)0.0%
Mean1.371172962
Minimum0
Maximum43
Zeros2152
Zeros (%)42.7%
Negative0
Negative (%)0.0%
Memory size39.5 KiB

Quantile statistics

Minimum0
5-th percentile0
Q10
median1
Q32
95-th percentile5
Maximum43
Range43
Interquartile range (IQR)2

Descriptive statistics

Standard deviation2.01357592
Coefficient of variation (CV)1.468506144
Kurtosis52.03373533
Mean1.371172962
Median Absolute Deviation (MAD)1
Skewness4.384765939
Sum6897
Variance4.054487986
MonotonicityNot monotonic
Histogram with fixed size bins (bins=19)
Value   Count   Frequency (%)
0   2152   42.7%
1   1251   24.8%
2   716   14.2%
3   380   7.5%
4   207   4.1%
5   114   2.3%
6   76   1.5%
7   48   1.0%
8   37   0.7%
9   18   0.4%
Other values (9)   31   0.6%
(Missing)   13   0.3%

Value   Count   Frequency (%)
0   2152   42.7%
1   1251   24.8%
2   716   14.2%
3   380   7.5%
4   207   4.1%
5   114   2.3%
6   76   1.5%
7   48   1.0%
8   37   0.7%
9   18   0.4%

Value   Count   Frequency (%)
43   1   < 0.1%
31   1   < 0.1%
19   1   < 0.1%
15   6   0.1%
14   1   < 0.1%
13   2   < 0.1%
12   4   0.1%
11   5   0.1%
10   10   0.2%
9   18   0.4%

plot_keywords
Categorical

HIGH CARDINALITY
MISSING
UNIFORM

Distinct4760
Distinct (%)97.3%
Missing153
Missing (%)3.0%
Memory size39.5 KiB
based on novel
 
4
eighteen wheeler|illegal street racing|truck|trucker|undercover cop
 
3
animal name in title|ape abducts a woman|gorilla|island|king kong
 
3
1940s|child hero|fantasy world|orphan|reference to peter pan
 
3
alien friendship|alien invasion|australia|flying car|mother daughter relationship
 
3
Other values (4755)
4874 

Length

Max length149
Median length50
Mean length52.42699387
Min length2

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique4639 ?
Unique (%)94.9%

Sample

1st rowavatar|future|marine|native|paraplegic
2nd rowgoddess|marriage ceremony|marriage proposal|pirate|singapore
3rd rowbomb|espionage|sequel|spy|terrorist
4th rowdeception|imprisonment|lawlessness|police officer|terrorist plot
5th rowalien|american civil war|male nipple|mars|princess

Common Values

Value    Count    Frequency (%)
based on novel    4    0.1%
eighteen wheeler|illegal street racing|truck|trucker|undercover cop    3    0.1%
animal name in title|ape abducts a woman|gorilla|island|king kong    3    0.1%
1940s|child hero|fantasy world|orphan|reference to peter pan    3    0.1%
alien friendship|alien invasion|australia|flying car|mother daughter relationship    3    0.1%
one word title    3    0.1%
halloween|masked killer|michael myers|slasher|trick or treat    3    0.1%
assistant|experiment|frankenstein|medical student|scientist    3    0.1%
anti war|liverpool|love|protest|song    2    < 0.1%
army|greek mythology|hercules|king|mercenary    2    < 0.1%
Other values (4750)    4861    96.4%
(Missing)    153    3.0%

Length

Histogram of lengths of the category
Value                  Count    Frequency (%)
in                     331      1.8%
of                     222      1.2%
on                     209      1.2%
the                    191      1.1%
a                      185      1.0%
to                     180      1.0%
york                   122      0.7%
based                  106      0.6%
female                 104      0.6%
by                     99       0.5%
Other values (11486)   16269    90.3%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

movie_imdb_link
Categorical

HIGH CARDINALITY
UNIFORM

Distinct4919
Distinct (%)97.5%
Missing0
Missing (%)0.0%
Memory size39.5 KiB
http://www.imdb.com/title/tt0077651/?ref_=fn_tt_tt_1
 
3
http://www.imdb.com/title/tt0360717/?ref_=fn_tt_tt_1
 
3
http://www.imdb.com/title/tt0232500/?ref_=fn_tt_tt_1
 
3
http://www.imdb.com/title/tt3332064/?ref_=fn_tt_tt_1
 
3
http://www.imdb.com/title/tt1976009/?ref_=fn_tt_tt_1
 
3
Other values (4914)
5028 

Length

Max length52
Median length52
Mean length52
Min length52

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0
Distinct scripts0
Distinct blocks0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique4802
Unique (%)95.2%
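
The gap between the Distinct and Unique figures is consistent with reading "Distinct" as the number of different values and "Unique" as the number of values that occur exactly once: 4919 − 4802 = 117 links would then appear more than once. The sketch below shows that reading on toy data. It assumes this interpretation of the two labels, and the function name and the sample identifiers are made up for illustration.

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for c in counts.values() if c == 1)
    return distinct, unique


# Toy identifiers (not taken from the dataset): two repeat, two occur once.
links = ["tt001", "tt002", "tt002", "tt003", "tt003", "tt003", "tt004"]
distinct, unique = distinct_and_unique(links)
print(distinct, unique)
```

Under this reading, `distinct - unique` gives the number of values that are repeated, which is a quick way to size a duplicate-link problem before dropping rows.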

Sample

1st rowhttp://www.imdb.com/title/tt0499549/?ref_=fn_tt_tt_1
2nd rowhttp://www.imdb.com/title/tt0449088/?ref_=fn_tt_tt_1
3rd rowhttp://www.imdb.com/title/tt2379713/?ref_=fn_tt_tt_1
4th rowhttp://www.imdb.com/title/tt1345836/?ref_=fn_tt_tt_1
5th rowhttp://www.imdb.com/title/tt5289954/?ref_=fn_tt_tt_1

Common Values

Value                                                   Count    Frequency (%)
http://www.imdb.com/title/tt0077651/?ref_=fn_tt_tt_1    3        0.1%
http://www.imdb.com/title/tt0360717/?ref_=fn_tt_tt_1    3        0.1%
http://www.imdb.com/title/tt0232500/?ref_=fn_tt_tt_1    3        0.1%
http://www.imdb.com/title/tt3332064/?ref_=fn_tt_tt_1    3        0.1%
http://www.imdb.com/title/tt1976009/?ref_=fn_tt_tt_1    3        0.1%
http://www.imdb.com/title/tt2224026/?ref_=fn_tt_tt_1    3        0.1%
http://www.imdb.com/title/tt2638144/?ref_=fn_tt_tt_1    3        0.1%
http://www.imdb.com/title/tt4178092/?ref_=fn_tt_tt_1    2        < 0.1%
http://www.imdb.com/title/tt0264935/?ref_=fn_tt_tt_1    2        < 0.1%
http://www.imdb.com/title/tt0844708/?ref_=fn_tt_tt_1    2        < 0.1%
Other values (4909)                                     5016     99.5%

Length

Histogram of lengths of the category
Value                                                   Count    Frequency (%)
http://www.imdb.com/title/tt0077651/?ref_=fn_tt_tt_1    3        0.1%
http://www.imdb.com/title/tt0360717/?ref_=fn_tt_tt_1    3        0.1%
http://www.imdb.com/title/tt0232500/?ref_=fn_tt_tt_1    3        0.1%
http://www.imdb.com/title/tt3332064/?ref_=fn_tt_tt_1    3        0.1%
http://www.imdb.com/title/tt1976009/?ref_=fn_tt_tt_1    3        0.1%
http://www.imdb.com/title/tt2224026/?ref_=fn_tt_tt_1    3        0.1%
http://www.imdb.com/title/tt2638144/?ref_=fn_tt_tt_1    3        0.1%
http://www.imdb.com/title/tt0872230/?ref_=fn_tt_tt_1    2        < 0.1%
http://www.imdb.com/title/tt0056193/?ref_=fn_tt_tt_1    2        < 0.1%
http://www.imdb.com/title/tt1742334/?ref_=fn_tt_tt_1    2        < 0.1%
Other values (4909)                                     5016     99.5%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

num_user_for_reviews
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct954
Distinct (%)19.0%
Missing21
Missing (%)0.4%
Infinite0
Infinite (%)0.0%
Mean272.7708084
Minimum1
Maximum5060
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size39.5 KiB

Quantile statistics

Minimum1
5-th percentile10
Q165
median156
Q3326
95-th percentile907.8
Maximum5060
Range5059
Interquartile range (IQR)261

Descriptive statistics

Standard deviation377.9828856
Coefficient of variation (CV)1.385716044
Kurtosis26.43829739
Mean272.7708084
Median Absolute Deviation (MAD)113
Skewness4.121475159
Sum1369855
Variance142871.0618
MonotonicityNot monotonic
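
Several of the figures above are derived from others, so they can be cross-checked by hand. The sketch below recomputes the coefficient of variation (standard deviation divided by mean), the variance (standard deviation squared), the interquartile range (Q3 − Q1) and the range (maximum − minimum) from the values reported for num_user_for_reviews. The variable names are illustrative, and the inputs are the rounded figures printed in this report rather than the raw data.

```python
import math

# Values as reported above for num_user_for_reviews.
mean = 272.7708084
std = 377.9828856
q1, q3 = 65, 326
minimum, maximum = 1, 5060

cv = std / mean                    # coefficient of variation
variance = std ** 2                # variance is the squared standard deviation
iqr = q3 - q1                      # interquartile range
value_range = maximum - minimum    # range

print(f"CV={cv:.9f} variance={variance:.4f} IQR={iqr} range={value_range}")

# Compare with the reported figures, allowing for rounding in the inputs.
print(math.isclose(cv, 1.385716044, rel_tol=1e-6))
print(math.isclose(variance, 142871.0618, rel_tol=1e-6))
```

A coefficient of variation above 1, together with a mean (about 273) well above the median (156), points to a long right tail, which matches the reported skewness of about 4.1.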
Histogram with fixed size bins (bins=50)
Value                 Count    Frequency (%)
1                     51       1.0%
3                     33       0.7%
26                    32       0.6%
2                     32       0.6%
10                    29       0.6%
6                     28       0.6%
50                    26       0.5%
32                    25       0.5%
8                     25       0.5%
31                    24       0.5%
Other values (944)    4717     93.5%

Value    Count    Frequency (%)
1        51       1.0%
2        32       0.6%
3        33       0.7%
4        23       0.5%
5        19       0.4%
6        28       0.6%
7        17       0.3%
8        25       0.5%
9        23       0.5%
10       29       0.6%

Value    Count    Frequency (%)
5060     1        < 0.1%
4667     1        < 0.1%
4144     1        < 0.1%
3646     1        < 0.1%
3597     1        < 0.1%
3516     1        < 0.1%
3400     1        < 0.1%
3286     1        < 0.1%
3189     1        < 0.1%
3054     1        < 0.1%

language
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct47
Distinct (%)0.9%
Missing12
Missing (%)0.2%
Memory size39.5 KiB
English
4704 
French
 
73
Spanish
 
40
Hindi
 
28
Mandarin
 
26
Other values (42)
 
160

Length

Max length10
Median length7
Mean length6.980719539
Min length4

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0
Distinct scripts0
Distinct blocks0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique18
Unique (%)0.4%

Sample

1st rowEnglish
2nd rowEnglish
3rd rowEnglish
4th rowEnglish
5th rowEnglish

Common Values

Value                Count    Frequency (%)
English              4704     93.3%
French               73       1.4%
Spanish              40       0.8%
Hindi                28       0.6%
Mandarin             26       0.5%
German               19       0.4%
Japanese             18       0.4%
Russian              11       0.2%
Cantonese            11       0.2%
Italian              11       0.2%
Other values (37)    90       1.8%
(Missing)            12       0.2%

Length

Histogram of lengths of the category
Value                Count    Frequency (%)
english              4704     93.5%
french               73       1.5%
spanish              40       0.8%
hindi                28       0.6%
mandarin             26       0.5%
german               19       0.4%
japanese             18       0.4%
russian              11       0.2%
cantonese            11       0.2%
italian              11       0.2%
Other values (37)    90       1.8%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

country
Categorical

HIGH CARDINALITY
HIGH CORRELATION
HIGH CORRELATION

Distinct65
Distinct (%)1.3%
Missing5
Missing (%)0.1%
Memory size39.5 KiB
USA
3807 
UK
448 
France
 
154
Canada
 
126
Germany
 
97
Other values (60)
406 

Length

Max length20
Median length3
Mean length3.489281461
Min length2

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0
Distinct scripts0
Distinct blocks0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique28
Unique (%)0.6%

Sample

1st rowUSA
2nd rowUSA
3rd rowUK
4th rowUSA
5th rowUSA

Common Values

Value                Count    Frequency (%)
USA                  3807     75.5%
UK                   448      8.9%
France               154      3.1%
Canada               126      2.5%
Germany              97       1.9%
Australia            55       1.1%
India                34       0.7%
Spain                33       0.7%
China                30       0.6%
Japan                23       0.5%
Other values (55)    231      4.6%

Length

Histogram of lengths of the category
Value                Count    Frequency (%)
usa                  3807     74.6%
uk                   448      8.8%
france               154      3.0%
canada               126      2.5%
germany              100      2.0%
australia            55       1.1%
india                34       0.7%
spain                33       0.6%
china                30       0.6%
japan                23       0.5%
Other values (63)    294      5.8%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

content_rating
Categorical

HIGH CORRELATION
MISSING

Distinct18
Distinct (%)0.4%
Missing303
Missing (%)6.0%
Memory size39.5 KiB
R
2118 
PG-13
1461 
PG
701 
Not Rated
 
116
G
 
112
Other values (13)
232 

Length

Max length9
Median length2
Mean length2.813924051
Min length1

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0
Distinct scripts0
Distinct blocks0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique2
Unique (%)< 0.1%

Sample

1st rowPG-13
2nd rowPG-13
3rd rowPG-13
4th rowPG-13
5th rowPG-13

Common Values

Value               Count    Frequency (%)
R                   2118     42.0%
PG-13               1461     29.0%
PG                  701      13.9%
Not Rated           116      2.3%
G                   112      2.2%
Unrated             62       1.2%
Approved            55       1.1%
TV-14               30       0.6%
TV-MA               20       0.4%
X                   13       0.3%
Other values (8)    52       1.0%
(Missing)           303      6.0%

Length

Histogram of lengths of the category
Value               Count    Frequency (%)
r                   2118     43.6%
pg-13               1461     30.1%
pg                  701      14.4%
not                 116      2.4%
rated               116      2.4%
g                   112      2.3%
unrated             62       1.3%
approved            55       1.1%
tv-14               30       0.6%
tv-ma               20       0.4%
Other values (9)    65       1.3%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

budget
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
MISSING
SKEWED

Distinct444
Distinct (%)9.6%
Missing406
Missing (%)8.1%
Infinite0
Infinite (%)0.0%
Mean39389277.68
Minimum218
Maximum1.22155 × 10^10
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size39.5 KiB

Quantile statistics

Minimum218
5-th percentile500000
Q16000000
median20000000
Q343000000
95-th percentile130000000
Maximum1.22155 × 10^10
Range1.221549978 × 10^10
Interquartile range (IQR)37000000

Descriptive statistics

Standard deviation204247286.3
Coefficient of variation (CV)5.18535242
Kurtosis2773.195111
Mean39389277.68
Median Absolute Deviation (MAD)16000000
Skewness48.57751667
Sum1.826480806 × 10^11
Variance4.171695398 × 10^16
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
Value                 Count    Frequency (%)
20000000              179      3.5%
30000000              146      2.9%
15000000              146      2.9%
25000000              143      2.8%
10000000              142      2.8%
40000000              133      2.6%
35000000              121      2.4%
5000000               111      2.2%
50000000              104      2.1%
12000000              96       1.9%
Other values (434)    3316     65.8%
(Missing)             406      8.1%

Value    Count    Frequency (%)
218      1        < 0.1%
1100     1        < 0.1%
1400     1        < 0.1%
3250     1        < 0.1%
4500     1        < 0.1%
7000     3        0.1%
9000     1        < 0.1%
10000    3        0.1%
13000    1        < 0.1%
14000    1        < 0.1%

Value              Count    Frequency (%)
1.22155 × 10^10    1        < 0.1%
4200000000         1        < 0.1%
2500000000         1        < 0.1%
2400000000         1        < 0.1%
2127519898         1        < 0.1%
1100000000         1        < 0.1%
1000000000         1        < 0.1%
700000000          2        < 0.1%
600000000          1        < 0.1%
553632000          1        < 0.1%

title_year
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct91
Distinct (%)1.8%
Missing108
Missing (%)2.1%
Infinite0
Infinite (%)0.0%
Mean2002.470517
Minimum1916
Maximum2016
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size39.5 KiB

Quantile statistics

Minimum1916
5-th percentile1979
Q11999
median2005
Q32011
95-th percentile2015
Maximum2016
Range100
Interquartile range (IQR)12

Descriptive statistics

Standard deviation12.47459892
Coefficient of variation (CV)0.006229604289
Kurtosis7.439212616
Mean2002.470517
Median Absolute Deviation (MAD)6
Skewness-2.29227335
Sum9882192
Variance155.6156182
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
Value                Count    Frequency (%)
2009                 260      5.2%
2014                 252      5.0%
2006                 239      4.7%
2013                 237      4.7%
2010                 230      4.6%
2015                 226      4.5%
2008                 225      4.5%
2011                 225      4.5%
2005                 221      4.4%
2012                 221      4.4%
Other values (81)    2599     51.5%

Value    Count    Frequency (%)
1916     1        < 0.1%
1920     1        < 0.1%
1925     1        < 0.1%
1927     1        < 0.1%
1929     2        < 0.1%
1930     1        < 0.1%
1932     1        < 0.1%
1933     2        < 0.1%
1934     1        < 0.1%
1935     1        < 0.1%

Value    Count    Frequency (%)
2016     106      2.1%
2015     226      4.5%
2014     252      5.0%
2013     237      4.7%
2012     221      4.4%
2011     225      4.5%
2010     230      4.6%
2009     260      5.2%
2008     225      4.5%
2007     204      4.0%

actor_2_fb_likes
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct917
Distinct (%)18.2%
Missing13
Missing (%)0.3%
Infinite0
Infinite (%)0.0%
Mean1651.754473
Minimum0
Maximum137000
Zeros55
Zeros (%)1.1%
Negative0
Negative (%)0.0%
Memory size39.5 KiB

Quantile statistics

Minimum0
5-th percentile26
Q1281
median595
Q3918
95-th percentile11000
Maximum137000
Range137000
Interquartile range (IQR)637

Descriptive statistics

Standard deviation4042.438863
Coefficient of variation (CV)2.447360627
Kurtosis256.7951889
Mean1651.754473
Median Absolute Deviation (MAD)317
Skewness9.884733179
Sum8308325
Variance16341311.96
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
Value                 Count    Frequency (%)
1000                  309      6.1%
11000                 111      2.2%
2000                  100      2.0%
3000                  76       1.5%
0                     55       1.1%
10000                 47       0.9%
14000                 41       0.8%
13000                 40       0.8%
826                   37       0.7%
4000                  34       0.7%
Other values (907)    4180     82.9%

Value    Count    Frequency (%)
0        55       1.1%
2        14       0.3%
3        14       0.3%
4        12       0.2%
5        10       0.2%
6        7        0.1%
7        4        0.1%
8        9        0.2%
9        13       0.3%
10       9        0.2%

Value     Count    Frequency (%)
137000    1        < 0.1%
29000     1        < 0.1%
27000     2        < 0.1%
25000     3        0.1%
23000     6        0.1%
22000     11       0.2%
21000     4        0.1%
20000     6        0.1%
19000     7        0.1%
18000     9        0.2%

imdb_score
Real number (ℝ≥0)

HIGH CORRELATION

Distinct78
Distinct (%)1.5%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean6.442137616
Minimum1.6
Maximum9.5
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size39.5 KiB

Quantile statistics

Minimum1.6
5-th percentile4.4
Q15.8
median6.6
Q37.2
95-th percentile8.09
Maximum9.5
Range7.9
Interquartile range (IQR)1.4

Descriptive statistics

Standard deviation1.125115866
Coefficient of variation (CV)0.1746494615
Kurtosis0.9356915064
Mean6.442137616
Median Absolute Deviation (MAD)0.7
Skewness-0.7414713363
Sum32487.7
Variance1.265885711
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
Value                Count    Frequency (%)
6.7                  223      4.4%
6.6                  201      4.0%
7.2                  195      3.9%
6.5                  186      3.7%
6.4                  185      3.7%
7.3                  184      3.6%
7                    184      3.6%
7.1                  181      3.6%
6.8                  181      3.6%
6.1                  179      3.5%
Other values (68)    3144     62.3%

Value    Count    Frequency (%)
1.6      1        < 0.1%
1.7      1        < 0.1%
1.9      3        0.1%
2        2        < 0.1%
2.1      3        0.1%
2.2      3        0.1%
2.3      3        0.1%
2.4      2        < 0.1%
2.5      2        < 0.1%
2.6      2        < 0.1%

Value    Count    Frequency (%)
9.5      1        < 0.1%
9.3      1        < 0.1%
9.2      1        < 0.1%
9.1      3        0.1%
9        3        0.1%
8.9      5        0.1%
8.8      7        0.1%
8.7      13       0.3%
8.6      15       0.3%
8.5      24       0.5%

aspect_ratio
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct22
Distinct (%)0.5%
Missing329
Missing (%)6.5%
Infinite0
Infinite (%)0.0%
Mean2.220403055
Minimum1.18
Maximum16
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size39.5 KiB

Quantile statistics

Minimum1.18
5-th percentile1.66
Q11.85
median2.35
Q32.35
95-th percentile2.35
Maximum16
Range14.82
Interquartile range (IQR)0.5

Descriptive statistics

Standard deviation1.385112535
Coefficient of variation (CV)0.6238113087
Kurtosis90.65322055
Mean2.220403055
Median Absolute Deviation (MAD)0
Skewness9.390056312
Sum10466.98
Variance1.918536735
MonotonicityNot monotonic
Histogram with fixed size bins (bins=22)
Value                Count    Frequency (%)
2.35                 2360     46.8%
1.85                 1906     37.8%
1.78                 110      2.2%
1.37                 100      2.0%
1.33                 68       1.3%
1.66                 64       1.3%
16                   45       0.9%
2.2                  15       0.3%
2.39                 15       0.3%
4                    7        0.1%
Other values (12)    24       0.5%
(Missing)            329      6.5%

Value    Count    Frequency (%)
1.18     1        < 0.1%
1.2      1        < 0.1%
1.33     68       1.3%
1.37     100      2.0%
1.44     1        < 0.1%
1.5      2        < 0.1%
1.66     64       1.3%
1.75     3        0.1%
1.77     1        < 0.1%
1.78     110      2.2%

Value    Count    Frequency (%)
16       45       0.9%
4        7        0.1%
2.76     3        0.1%
2.55     2        < 0.1%
2.4      3        0.1%
2.39     15       0.3%
2.35     2360     46.8%
2.24     1        < 0.1%
2.2      15       0.3%
2        5        0.1%

movie_fb_likes
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct876
Distinct (%)17.4%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean7525.964505
Minimum0
Maximum349000
Zeros2181
Zeros (%)43.2%
Negative0
Negative (%)0.0%
Memory size39.5 KiB

Quantile statistics

Minimum0
5-th percentile0
Q10
median166
Q33000
95-th percentile40000
Maximum349000
Range349000
Interquartile range (IQR)3000

Descriptive statistics

Standard deviation19320.44511
Coefficient of variation (CV)2.567171968
Kurtosis41.33443692
Mean7525.964505
Median Absolute Deviation (MAD)166
Skewness5.05892689
Sum37953439
Variance373279599.2
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
Value                 Count    Frequency (%)
0                     2181     43.2%
1000                  109      2.2%
11000                 83       1.6%
10000                 81       1.6%
12000                 62       1.2%
13000                 58       1.2%
2000                  56       1.1%
15000                 53       1.1%
14000                 50       1.0%
16000                 47       0.9%
Other values (866)    2263     44.9%

Value    Count    Frequency (%)
0        2181     43.2%
2        2        < 0.1%
3        1        < 0.1%
4        5        0.1%
5        2        < 0.1%
7        3        0.1%
8        1        < 0.1%
9        3        0.1%
10       2        < 0.1%
11       2        < 0.1%

Value     Count    Frequency (%)
349000    1        < 0.1%
199000    1        < 0.1%
197000    1        < 0.1%
191000    1        < 0.1%
190000    1        < 0.1%
175000    1        < 0.1%
166000    1        < 0.1%
165000    1        < 0.1%
164000    1        < 0.1%
153000    1        < 0.1%

Interactions

[Figures: pairwise interaction plots between the numeric variables]
2022-01-14T02:04:00.285620image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-01-14T02:04:02.679219image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-01-14T02:03:30.522220image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-01-14T02:03:32.771206image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-01-14T02:03:34.890538image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-01-14T02:03:36.807411image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-01-14T02:03:38.924748image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-01-14T02:03:41.134839image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-01-14T02:03:43.322985image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-01-14T02:03:45.714589image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-01-14T02:03:47.857858image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-01-14T02:03:49.849531image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-01-14T02:03:52.014740image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-01-14T02:03:54.040323image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-01-14T02:03:56.184588image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-01-14T02:03:58.415621image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-01-14T02:04:00.446190image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-01-14T02:04:02.857741image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-01-14T02:03:30.645890image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-01-14T02:03:32.934768image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-01-14T02:03:35.009220image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-01-14T02:03:36.942051image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-01-14T02:03:39.065372image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-01-14T02:03:41.301392image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-01-14T02:03:43.502506image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-01-14T02:03:45.840253image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-01-14T02:03:47.994492image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-01-14T02:03:49.970208image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-01-14T02:03:52.133422image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-01-14T02:03:54.165986image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-01-14T02:03:56.308257image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-01-14T02:03:58.545274image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-01-14T02:04:00.573849image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-01-14T02:04:02.977421image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-01-14T02:03:30.859319image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-01-14T02:03:33.068411image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-01-14T02:03:35.135882image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-01-14T02:03:37.053753image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-01-14T02:03:39.199015image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-01-14T02:03:41.433040image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-01-14T02:03:43.631162image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-01-14T02:03:45.957938image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-01-14T02:03:48.134119image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-01-14T02:03:50.104848image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-01-14T02:03:52.246121image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-01-14T02:03:54.407341image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-01-14T02:03:56.456860image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-01-14T02:03:58.677919image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-01-14T02:04:00.814207image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-01-14T02:04:03.120039image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-01-14T02:03:30.979996image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-01-14T02:03:33.179115image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-01-14T02:03:35.271519image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-01-14T02:03:37.164456image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-01-14T02:03:39.322685image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-01-14T02:03:41.584635image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-01-14T02:03:43.750841image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-01-14T02:03:46.100557image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-01-14T02:03:48.256791image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-01-14T02:03:50.225526image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-01-14T02:03:52.359818image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-01-14T02:03:54.518045image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-01-14T02:03:56.578534image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-01-14T02:03:58.792613image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-01-14T02:04:00.928899image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/

Correlations


Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
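
As a rough illustration of the calculation described above, the following sketch ranks both variables (ties share the mean of their positions) and then applies the covariance-over-standard-deviations formula to the ranks. The function names and the sample data are hypothetical and are not taken from the report's own implementation.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Covariance divided by the product of standard deviations.

    The 1/n factors of the covariance and the standard deviations cancel,
    so plain sums are used.
    """
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman's rho: Pearson's r computed on the rank variables."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical data: y = x**3 is monotonic but not linear.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
rho = spearman(x, y)  # close to 1: the relationship is perfectly monotonic
r = pearson(x, y)     # below 1: the relationship is not linear
```

On this toy input ρ reaches its maximum while r stays below 1, which is the sense in which ρ catches nonlinear monotonic relationships.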

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, so for a linear relationship the steepness of the line (its angle to the x-axis) does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
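
The formula and the invariance property can be checked with a small sketch. The data below are hypothetical, and `pearson` is an illustrative helper rather than the function the report itself calls.

```python
from statistics import mean


def pearson(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    mx, my = mean(x), mean(y)
    # The 1/n factors in the covariance and standard deviations cancel out.
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


# Hypothetical observations.
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
r = pearson(x, y)

# Shifting and (positively) rescaling x leaves r unchanged.
x_rescaled = [100 * v + 7 for v in x]
r_rescaled = pearson(x_rescaled, y)

# A steep and a shallow exact line both give r close to 1.
r_steep = pearson(x, [10 * v for v in x])
r_shallow = pearson(x, [0.1 * v for v in x])
```

Comparing `r` with `r_rescaled`, and `r_steep` with `r_shallow`, shows that only the strength of the linear relationship matters, not its units or slope.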

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one counts the concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
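
A minimal sketch of the pair-counting formula as stated above. It implements the simple variant without any correction for ties (often called tau-a); library routines such as `scipy.stats.kendalltau` default to a tie-corrected variant, so results can differ on data with ties. The function name and data are hypothetical.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total pairs, with no tie correction."""
    concordant = discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        # Positive product: both variables move in the same direction.
        direction = (x[i] - x[j]) * (y[i] - y[j])
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
        # A zero product is a tie and counts toward neither group.
    return (concordant - discordant) / len(pairs)


# Hypothetical observations: 8 concordant and 2 discordant pairs out of 10.
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
tau = kendall_tau_a(x, y)
```

With 8 concordant and 2 discordant pairs out of 10, the result is (8 - 2) / 10.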

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use a bias-corrected measure proposed by Bergsma in 2013.
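
The sketch below shows one common statement of the Bergsma (2013) bias correction, computed from a contingency table of counts. Whether the report's implementation matches it term for term is an assumption; the function name is hypothetical, and the code assumes every row and column total is non-zero.

```python
def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table (list of rows of counts)."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    r, k = len(row_totals), len(col_totals)

    # Pearson chi-squared statistic against the independence expectation.
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    phi2 = chi2 / n
    # Bias correction: shrink phi^2 and the effective table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    k_corr = k - (k - 1) ** 2 / (n - 1)
    r_corr = r - (r - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    return (phi2_corr / denom) ** 0.5 if denom > 0 else 0.0


# Hypothetical tables: perfect association and exact independence.
v_perfect = cramers_v_corrected([[50, 0], [0, 50]])
v_independent = cramers_v_corrected([[25, 25], [25, 25]])
```

The two toy tables sit at the ends of the 0 to 1 range described above.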

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available for Phik.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion by eye.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
The dendrogram allows you to more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap.
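
Nullity correlation can be sketched as the Pearson correlation between per-column "value present" indicators. This is an assumption about how such a heatmap is typically computed, not a description of the report's internals. The rows below are hypothetical and only borrow column names from the dataset.

```python
from statistics import mean


def nullity_correlation(rows, col_a, col_b):
    """Correlate the presence (1) or absence (0) of two columns across rows.

    Returns None when either column is always present or always missing,
    because its indicator then has zero variance.
    """
    a = [0 if row[col_a] is None else 1 for row in rows]
    b = [0 if row[col_b] is None else 1 for row in rows]
    ma, mb = mean(a), mean(b)
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    sa = sum((p - ma) ** 2 for p in a) ** 0.5
    sb = sum((q - mb) ** 2 for q in b) ** 0.5
    if sa == 0 or sb == 0:
        return None
    return cov / (sa * sb)


# Hypothetical rows: gross and budget are always missing together.
rows = [
    {"gross": None, "budget": None, "title_year": 2009, "imdb_score": 7.9},
    {"gross": None, "budget": None, "title_year": 2007, "imdb_score": 7.1},
    {"gross": 1.0, "budget": 2.0, "title_year": None, "imdb_score": 6.8},
    {"gross": 3.0, "budget": 4.0, "title_year": 2012, "imdb_score": 8.5},
    {"gross": 5.0, "budget": 6.0, "title_year": 2012, "imdb_score": 6.6},
    {"gross": 7.0, "budget": 8.0, "title_year": 2010, "imdb_score": 7.8},
]
together = nullity_correlation(rows, "gross", "budget")
apart = nullity_correlation(rows, "gross", "title_year")
undefined = nullity_correlation(rows, "gross", "imdb_score")
```

A value near 1 means two columns tend to be missing in the same rows, a negative value means one tends to be present when the other is missing, and a fully populated column has no defined nullity correlation.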

Sample

First rows

,color,director_name,num_critic_for_reviews,duration,director_fb_likes,actor_3_fb_likes,actor_2_name,actor_1_fb_likes,gross,genres,actor_1_name,movie_title,num_voted_users,cast_total_fb_likes,actor_3_name,facenumber_in_poster,plot_keywords,movie_imdb_link,num_user_for_reviews,language,country,content_rating,budget,title_year,actor_2_fb_likes,imdb_score,aspect_ratio,movie_fb_likes
0,Color,James Cameron,723.0,178.0,0.0,855.0,Joel David Moore,1000.0,760505847.0,Action|Adventure|Fantasy|Sci-Fi,CCH Pounder,Avatar,886204,4834,Wes Studi,0.0,avatar|future|marine|native|paraplegic,http://www.imdb.com/title/tt0499549/?ref_=fn_tt_tt_1,3054.0,English,USA,PG-13,237000000.0,2009.0,936.0,7.9,1.78,33000
1,Color,Gore Verbinski,302.0,169.0,563.0,1000.0,Orlando Bloom,40000.0,309404152.0,Action|Adventure|Fantasy,Johnny Depp,Pirates of the Caribbean: At World's End,471220,48350,Jack Davenport,0.0,goddess|marriage ceremony|marriage proposal|pirate|singapore,http://www.imdb.com/title/tt0449088/?ref_=fn_tt_tt_1,1238.0,English,USA,PG-13,300000000.0,2007.0,5000.0,7.1,2.35,0
2,Color,Sam Mendes,602.0,148.0,0.0,161.0,Rory Kinnear,11000.0,200074175.0,Action|Adventure|Thriller,Christoph Waltz,Spectre,275868,11700,Stephanie Sigman,1.0,bomb|espionage|sequel|spy|terrorist,http://www.imdb.com/title/tt2379713/?ref_=fn_tt_tt_1,994.0,English,UK,PG-13,245000000.0,2015.0,393.0,6.8,2.35,85000
3,Color,Christopher Nolan,813.0,164.0,22000.0,23000.0,Christian Bale,27000.0,448130642.0,Action|Thriller,Tom Hardy,The Dark Knight Rises,1144337,106759,Joseph Gordon-Levitt,0.0,deception|imprisonment|lawlessness|police officer|terrorist plot,http://www.imdb.com/title/tt1345836/?ref_=fn_tt_tt_1,2701.0,English,USA,PG-13,250000000.0,2012.0,23000.0,8.5,2.35,164000
4,NaN,Doug Walker,NaN,NaN,131.0,NaN,Rob Walker,131.0,NaN,Documentary,Doug Walker,Star Wars: Episode VII - The Force Awakens,8,143,NaN,0.0,NaN,http://www.imdb.com/title/tt5289954/?ref_=fn_tt_tt_1,NaN,NaN,NaN,NaN,NaN,NaN,12.0,7.1,NaN,0
5,Color,Andrew Stanton,462.0,132.0,475.0,530.0,Samantha Morton,640.0,73058679.0,Action|Adventure|Sci-Fi,Daryl Sabara,John Carter,212204,1873,Polly Walker,1.0,alien|american civil war|male nipple|mars|princess,http://www.imdb.com/title/tt0401729/?ref_=fn_tt_tt_1,738.0,English,USA,PG-13,263700000.0,2012.0,632.0,6.6,2.35,24000
6,Color,Sam Raimi,392.0,156.0,0.0,4000.0,James Franco,24000.0,336530303.0,Action|Adventure|Romance,J.K. Simmons,Spider-Man 3,383056,46055,Kirsten Dunst,0.0,sandman|spider man|symbiote|venom|villain,http://www.imdb.com/title/tt0413300/?ref_=fn_tt_tt_1,1902.0,English,USA,PG-13,258000000.0,2007.0,11000.0,6.2,2.35,0
7,Color,Nathan Greno,324.0,100.0,15.0,284.0,Donna Murphy,799.0,200807262.0,Adventure|Animation|Comedy|Family|Fantasy|Musical|Romance,Brad Garrett,Tangled,294810,2036,M.C. Gainey,1.0,17th century|based on fairy tale|disney|flower|tower,http://www.imdb.com/title/tt0398286/?ref_=fn_tt_tt_1,387.0,English,USA,PG,260000000.0,2010.0,553.0,7.8,1.85,29000
8,Color,Joss Whedon,635.0,141.0,0.0,19000.0,Robert Downey Jr.,26000.0,458991599.0,Action|Adventure|Sci-Fi,Chris Hemsworth,Avengers: Age of Ultron,462669,92000,Scarlett Johansson,4.0,artificial intelligence|based on comic book|captain america|marvel cinematic universe|superhero,http://www.imdb.com/title/tt2395427/?ref_=fn_tt_tt_1,1117.0,English,USA,PG-13,250000000.0,2015.0,21000.0,7.5,2.35,118000
9,Color,David Yates,375.0,153.0,282.0,10000.0,Daniel Radcliffe,25000.0,301956980.0,Adventure|Family|Fantasy|Mystery,Alan Rickman,Harry Potter and the Half-Blood Prince,321795,58753,Rupert Grint,3.0,blood|book|love|potion|professor,http://www.imdb.com/title/tt0417741/?ref_=fn_tt_tt_1,973.0,English,UK,PG,250000000.0,2009.0,11000.0,7.5,2.35,10000

Last rows

,color,director_name,num_critic_for_reviews,duration,director_fb_likes,actor_3_fb_likes,actor_2_name,actor_1_fb_likes,gross,genres,actor_1_name,movie_title,num_voted_users,cast_total_fb_likes,actor_3_name,facenumber_in_poster,plot_keywords,movie_imdb_link,num_user_for_reviews,language,country,content_rating,budget,title_year,actor_2_fb_likes,imdb_score,aspect_ratio,movie_fb_likes
5033,Color,Shane Carruth,143.0,77.0,291.0,8.0,David Sullivan,291.0,424760.0,Drama|Sci-Fi|Thriller,Shane Carruth,Primer,72639,368,Casey Gooden,0.0,changing the future|independent film|invention|nonlinear timeline|time travel,http://www.imdb.com/title/tt0390384/?ref_=fn_tt_tt_1,371.0,English,USA,PG-13,7000.0,2004.0,45.0,7.0,1.85,19000
5034,Color,Neill Dela Llana,35.0,80.0,0.0,0.0,Edgar Tancangco,0.0,70071.0,Thriller,Ian Gamazon,Cavite,589,0,Quynn Ton,0.0,jihad|mindanao|philippines|security guard|squatter,http://www.imdb.com/title/tt0428303/?ref_=fn_tt_tt_1,35.0,English,Philippines,Not Rated,7000.0,2005.0,0.0,6.3,NaN,74
5035,Color,Robert Rodriguez,56.0,81.0,0.0,6.0,Peter Marquardt,121.0,2040920.0,Action|Crime|Drama|Romance|Thriller,Carlos Gallardo,El Mariachi,52055,147,Consuelo Gómez,0.0,assassin|death|guitar|gun|mariachi,http://www.imdb.com/title/tt0104815/?ref_=fn_tt_tt_1,130.0,Spanish,USA,R,7000.0,1992.0,20.0,6.9,1.37,0
5036,Color,Anthony Vallone,NaN,84.0,2.0,2.0,John Considine,45.0,NaN,Crime|Drama,Richard Jewell,The Mongol King,36,93,Sara Stepnicka,0.0,jewell|mongol|nostradamus|stepnicka|vallone,http://www.imdb.com/title/tt0430371/?ref_=fn_tt_tt_1,1.0,English,USA,PG-13,3250.0,2005.0,44.0,7.8,NaN,4
5037,Color,Edward Burns,14.0,95.0,0.0,133.0,Caitlin FitzGerald,296.0,4584.0,Comedy|Drama,Kerry Bishé,Newlyweds,1338,690,Daniella Pineda,1.0,written and directed by cast member,http://www.imdb.com/title/tt1880418/?ref_=fn_tt_tt_1,14.0,English,USA,Not Rated,9000.0,2011.0,205.0,6.4,NaN,413
5038,Color,Scott Smith,1.0,87.0,2.0,318.0,Daphne Zuniga,637.0,NaN,Comedy|Drama,Eric Mabius,Signed Sealed Delivered,629,2283,Crystal Lowe,2.0,fraud|postal worker|prison|theft|trial,http://www.imdb.com/title/tt3000844/?ref_=fn_tt_tt_1,6.0,English,Canada,NaN,NaN,2013.0,470.0,7.7,NaN,84
5039,Color,NaN,43.0,43.0,NaN,319.0,Valorie Curry,841.0,NaN,Crime|Drama|Mystery|Thriller,Natalie Zea,The Following,73839,1753,Sam Underwood,1.0,cult|fbi|hideout|prison escape|serial killer,http://www.imdb.com/title/tt2071645/?ref_=fn_tt_tt_1,359.0,English,USA,TV-14,NaN,NaN,593.0,7.5,16.00,32000
5040,Color,Benjamin Roberds,13.0,76.0,0.0,0.0,Maxwell Moody,0.0,NaN,Drama|Horror|Thriller,Eva Boehnke,A Plague So Pleasant,38,0,David Chandler,0.0,NaN,http://www.imdb.com/title/tt2107644/?ref_=fn_tt_tt_1,3.0,English,USA,NaN,1400.0,2013.0,0.0,6.3,NaN,16
5041,Color,Daniel Hsia,14.0,100.0,0.0,489.0,Daniel Henney,946.0,10443.0,Comedy|Drama|Romance,Alan Ruck,Shanghai Calling,1255,2386,Eliza Coupe,5.0,NaN,http://www.imdb.com/title/tt2070597/?ref_=fn_tt_tt_1,9.0,English,USA,PG-13,NaN,2012.0,719.0,6.3,2.35,660
5042,Color,Jon Gunn,43.0,90.0,16.0,16.0,Brian Herzlinger,86.0,85222.0,Documentary,John August,My Date with Drew,4285,163,Jon Gunn,0.0,actress name in title|crush|date|four word title|video camera,http://www.imdb.com/title/tt0378407/?ref_=fn_tt_tt_1,84.0,English,USA,PG,1100.0,2004.0,23.0,6.6,1.85,456

Duplicate rows

Most frequently occurring

,color,director_name,num_critic_for_reviews,duration,director_fb_likes,actor_3_fb_likes,actor_2_name,actor_1_fb_likes,gross,genres,actor_1_name,movie_title,num_voted_users,cast_total_fb_likes,actor_3_name,facenumber_in_poster,plot_keywords,movie_imdb_link,num_user_for_reviews,language,country,content_rating,budget,title_year,actor_2_fb_likes,imdb_score,aspect_ratio,movie_fb_likes,# duplicates
0,Black and White,George A. Romero,284.0,96.0,0.0,56.0,Duane Jones,125.0,236452.0,Drama|Horror|Mystery,Judith O'Dea,Night of the Living Dead,87978,403,S. William Hinzman,5.0,cemetery|farmhouse|radiation|running out of gas|zombie,http://www.imdb.com/title/tt0063350/?ref_=fn_tt_tt_1,580.0,English,USA,Unrated,114000.0,1968.0,108.0,8.0,1.85,0,2
1,Black and White,Yimou Zhang,283.0,80.0,611.0,576.0,Tony Chiu Wai Leung,5000.0,84961.0,Action|Adventure|History,Jet Li,Hero,149414,6229,Maggie Cheung,4.0,china|flying|king|palace|sword,http://www.imdb.com/title/tt0299977/?ref_=fn_tt_tt_1,841.0,Mandarin,China,PG-13,31000000.0,2002.0,643.0,7.9,2.35,0,2
2,Color,Albert Hughes,208.0,122.0,117.0,140.0,Jason Flemyng,40000.0,31598308.0,Horror|Mystery|Thriller,Johnny Depp,From Hell,124765,41636,Ian Richardson,1.0,freemason|jack the ripper|opium|prostitute|victorian era,http://www.imdb.com/title/tt0120681/?ref_=fn_tt_tt_1,541.0,English,USA,R,35000000.0,2001.0,1000.0,6.8,2.35,0,2
3,Color,Angelina Jolie Pitt,322.0,137.0,11000.0,465.0,Jack O'Connell,769.0,115603980.0,Biography|Drama|Sport|War,Finn Wittrock,Unbroken,103589,2938,Alex Russell,0.0,emaciation|male nudity|plane crash|prisoner of war|torture,http://www.imdb.com/title/tt1809398/?ref_=fn_tt_tt_1,351.0,English,USA,PG-13,65000000.0,2014.0,698.0,7.2,2.35,35000,2
4,Color,Bill Condon,322.0,115.0,386.0,12000.0,Kristen Stewart,21000.0,292298923.0,Adventure|Drama|Fantasy|Romance,Robert Pattinson,The Twilight Saga: Breaking Dawn - Part 2,185394,59177,Taylor Lautner,3.0,battle|friend|super strength|vampire|vision,http://www.imdb.com/title/tt1673434/?ref_=fn_tt_tt_1,329.0,English,USA,PG-13,120000000.0,2012.0,17000.0,5.5,2.35,65000,2
5,Color,Brett Ratner,245.0,101.0,420.0,467.0,Rufus Sewell,12000.0,72660029.0,Action|Adventure,Dwayne Johnson,Hercules,115687,16235,Ingrid Bolsø Berdal,0.0,army|greek mythology|hercules|king|mercenary,http://www.imdb.com/title/tt1267297/?ref_=fn_tt_tt_1,269.0,English,USA,PG-13,100000000.0,2014.0,3000.0,6.0,2.35,21000,2
6,Color,Bruce McCulloch,52.0,85.0,54.0,455.0,Megan Mullally,985.0,13973532.0,Comedy|Crime,Martin Starr,Stealing Harvard,11211,3065,Chris Penn,1.0,black humor|crying during sex|harvard|humor|man with glasses,http://www.imdb.com/title/tt0265808/?ref_=fn_tt_tt_1,92.0,English,USA,PG-13,25000000.0,2002.0,637.0,5.1,1.85,215,2
7,Color,Danny Boyle,393.0,101.0,0.0,888.0,Spencer Wilding,3000.0,2319187.0,Crime|Drama|Mystery|Thriller,Rosario Dawson,Trance,92640,5056,Tuppence Middleton,0.0,amnesia|criminal|heist|hypnotherapy|lost painting,http://www.imdb.com/title/tt1924429/?ref_=fn_tt_tt_1,212.0,English,UK,R,20000000.0,2013.0,1000.0,7.0,2.35,23000,2
8,Color,David Yates,248.0,110.0,282.0,103.0,Alexander Skarsgård,11000.0,124051759.0,Action|Adventure|Drama|Romance,Christoph Waltz,The Legend of Tarzan,42372,21175,Casper Crump,2.0,africa|capture|jungle|male objectification|tarzan,http://www.imdb.com/title/tt0918940/?ref_=fn_tt_tt_1,239.0,English,USA,PG-13,180000000.0,2016.0,10000.0,6.6,2.35,29000,2
9,Color,Frank Oz,168.0,87.0,0.0,548.0,Ewen Bremner,22000.0,8579684.0,Comedy,Peter Dinklage,Death at a Funeral,89547,24324,Kris Marshall,0.0,end credits roll call|four word title|funeral|secret|uncle,http://www.imdb.com/title/tt0795368/?ref_=fn_tt_tt_1,199.0,English,USA,R,9000000.0,2007.0,557.0,7.4,1.85,0,2
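
A listing like the one above can be produced by counting how often each full row occurs and keeping the rows seen more than once. The sketch below assumes the "# duplicates" column is that occurrence count; the helper name and the toy rows are hypothetical.

```python
from collections import Counter


def most_frequent_duplicates(rows, top=10):
    """Return (row, count) for rows occurring more than once, most frequent first."""
    counts = Counter(rows)  # rows must be hashable, e.g. tuples
    duplicated = [(row, n) for row, n in counts.items() if n > 1]
    duplicated.sort(key=lambda item: item[1], reverse=True)
    return duplicated[:top]


# Hypothetical (movie_title, title_year) rows with some exact repeats.
rows = [
    ("Hero", 2002),
    ("Trance", 2013),
    ("Hero", 2002),
    ("Primer", 2004),
    ("Trance", 2013),
    ("Hero", 2002),
]
duplicates = most_frequent_duplicates(rows)
```

Rows that occur only once, such as the `Primer` entry here, are left out of the result.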